Compressive Estimation of Doubly Selective 
Channels in Multicarrier Systems: Leakage 
Effects and Sparsity-Enhancing Processing 

Georg Taubock, Member, IEEE, Franz Hlawatsch, Senior Member, IEEE, Daniel Eiwen, Student Member, IEEE, 

and Holger Rauhut 



o 



C/2 



> 

l> 
(N 

cn 
o 

ON 

o 



Abstract — We consider the application of compressed sensing 
(CS) to the estimation of doubly selective channels within pulse- 
shaping multicarrier systems (which include OFDM systems as a 
special case). By exploiting sparsity in the delay-Doppler domain, 
CS-based channel estimation allows for an increase in spectral 
efficiency through a reduction of the number of pilot symbols. 
For combating leakage effects that limit the delay-Doppler 
sparsity, we propose a sparsity-enhancing basis expansion and a 
method for optimizing the basis with or without prior statistical 
information about the channel. We also present an alternative CS- 
based channel estimator for (potentially) strongly time-frequency 
dispersive channels, which is capable of estimating the "off- 
diagonal" channel coefficients characterizing intersymbol and 
intercarrier interference (ISI/ICI). For this estimator, we propose 
a basis construction combining Fourier (exponential) and prolate 
spheroidal sequences. Simulation results assess the performance 
gains achieved by the proposed sparsity-enhancing processing 
techniques and by explicit estimation of ISI/ICI channel coeffi- 
cients. 

Index Terms — channel estimation, compressed sensing, 
CoSaMP, dictionary learning, doubly selective channel, inter- 
carrier interference, intersymbol interference. Lasso, multi- 
carrier modulation, orthogonal frequency-division multiplexing 
(OFDM), orthogonal matching pursuit (OMP), sparse reconstruc- 
tion. 



I. Introduction 

The recently introduced principle and methodology of com- 
pressed sensing (CS) allows the efficient reconstruction of 
sparse signals from a very limited number of measurements 
(samples) [1,2]. CS has gained a fast-growing interest in 
applied mathematics and signal processing [3]. In this paper. 
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we apply CS to the estimation of doubly selective (dou- 
bly dispersive, doubly spread) channels. We consider pulse- 
shaping multicarrier (MC) systems, which include orthogonal 
frequency-division multiplexing (OFDM) as a special case [4, 
5]. OFDM is part of, or proposed for, numerous wireless 
standards like WLANs (IEEE 802.11a,g,n, Hiperlan/2), fixed 
broadband wireless access (IEEE 802.16), wireless personal 
area networks (IEEE 802.15), digital audio and video broad- 
casting (DAB, DRM, DVB), and future cellular communica- 
tion systems (3GPP LTE) [6-11]. 

Coherent detection in such systems requires channel state 
information (CSI) at the receiver Usually, CSI is obtained 
by embedding pilot symbols in the transmit signal and using 
a least-squares (LS) [12] or minimum mean-square error 
(MMSE) [13] channel estimator More advanced channel es- 
timators for MC transmissions include estimators employing 
one-dimensional (1-D), double 1-D, or two-dimensional (2-D) 
MMSE filtering algorithms [14-16]; 2-D irregular sampling 
techniques [17]; or basis expansion models [18-20]. The 
CS-based ("compressive") channel estimation methodology 
proposed in this paper exploits the fact that doubly selective 
multipath channels tend to be dominated by a relatively 
small number of clusters of significant paths, especially for 
large signaling bandwidths and durations [21]. Conventional 
methods for channel estimation do not take advantage of this 
inherent sparsity of the channel. In [22,23], we proposed 
CS-based techniques for estimating doubly selective channels 
within MC systems. We demonstrated that CS provides a way 
to exploit channel sparsity in the sense that the number of 
pilot symbols that have to be transmitted for accurate channel 
estimation can be reduced. Transmitting fewer pilots leaves 
more symbols for transmitting data, which yields an increase 
in spectral efficiency. 

For sparse channel estimation, several other authors have 
independently proposed the application of CS methods or 
methods inspired by the literature on sparse signal repre- 
sentations [21,24-31]. Both [24] and [26] considered single- 
carrier signaling and proposed variants of the matching pursuit 
algorithm [32] for channel estimation. The results were pri- 
marily based on simulation and experimental implementations, 
without a CS theoretical background. The channel estima- 
tion techniques presented in [24,27,28] limited themselves 
to sparsity in the delay domain, i.e., they did not exploit 
Doppler sparsity. The recent work in [29] and its extension 
to multiple-input/multiple-output (MIMO) channels [30], on 
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the other hand, considered both MC signaling (besides single- 
carrier signaling) and sparsity in the delay-Doppler domain, 
somewhat similar to [22]; however, a different CS recovery 
technique was used. In [33], it is shown experimentally for 
MC communications over underwater acoustic channels that 
compressive channel estimation outperforms traditional sub- 
space algorithms (root-MUSIC and ESPRIT). 

In this paper, extending our work in [22, 23], we present CS- 
based techniques for estimating doubly selective channels that 
are potentially strongly time- and/or frequency-dispersive. In 
MC systems, strong channel dispersion may cause intersymbol 
interference (ISI) and/or intercarrier interference (ICI) [4]. 
One of the proposed techniques enables the estimation of 
ISI/ICI channel coefficients. We first present a basic com- 
pressive estimator for mildly dispersive channels that yields 
estimates of the "diagonal" channel coefficients. Our focus 
is on leakage effects that limit the delay-Doppler sparsity, and 
which have not been considered in [21,24-31]. For combating 
leakage effects and, hence, enhancing sparsity, we then replace 
the discrete Fourier transform (DFT) used in conventional 
compressive channel estimation by a more suitable basis 
expansion. We also develop an iterative basis-optimization 
procedure that is similar in spirit — but not algorithmic ally — to 
dictionary learning techniques recently proposed in [34-36]. 
This procedure is able to take into account prior statistical 
information about the channel. Next, we present an alternative 
compressive method for estimating also the "off-diagonal" 
ISI/ICI channel coefficients of potentially strongly dispersive 
channels (e.g., highly mobile wireless channels or underwater 
acoustic channels [26,33]). Here, motivated by [20,37], we 
propose a sparsity-enhancing basis expansion that combines 
Fourier (exponential) and prolate spheroidal sequences. 

This paper is organized as follows. In Section |II] we 
describe the MC system model. In Section Hill we present the 
basic compressive estimator for mildly dispersive channels. 
An analysis of delay-Doppler leakage and its effect on delay- 
Doppler sparsity is performed in Section HV] A sparsity- 
enhancing basis expansion and a framework and iterative 
algorithm for optimizing the basis (with or without prior 
statistical information about the channel) are developed in 
Sections [V] and IVII respectively. In Section IVIII we propose a 
compressive estimator and a basis expansion for (potentially) 
strongly dispersive channels. Finally, simulation results pre- 
sented in Section [Villi assess the performance gains achieved 
by the proposed sparsity-enhancing basis expansions and by 
the estimation of ISI/ICI channel coefficients. 



II. MuLTicARRiER System Model 



We assume a pulse-shaping MC system for the sake of 
generality and because of its advantages over conventional 
cycHc-prefix (CP) OFDM [4, 38^1]. This framework includes 
CP-OFDM as a special case. The complex baseband domain 
is considered throughout. 



A. Modulator, Channel, Demodulator 

The MC modulator generates the discrete-time transmit 
signal [4] 

L-l K-l 

= X! X! "',fc5i,fcN > (1) 

;=0 A:=0 

where L and K denote the numbers of transmitted MC 
symbols and subcarriers, respectively; a/^fe (/ = 0, . . . , 
1; k = 0, 1) denotes the complex data symbols, 

drawn from a finite symbol alphabet A; and gi^k[n] = g[n — 
IN] gj'^'^Hn-iN) / K ^ time-frequency shift of a transmit pulse 
g[n] {N>K is the symbol duration). Using an interpolation 
filter with impulse response /i(t), s[n] is converted into the 
continuous-time transmit signal 

oo 

s{t) = s[n]h{t~nTl), (2) 

n— — oo 

where % is the sampling period. This signal is transmitted 
over a noisy, doubly selective channel, at whose output the 
receive signal 

/oo 
h{t,T)s{t-T)dT + z{t) (3) 
-oo 

is obtained. Here, h{t, t) is the channel's time-varying impulse 
response and z{t) is complex noise. At the receiver, r{t) is 
converted into the discrete-time receive signal 

/oo 
r{t)h{nT,-t)dt, (4) 
'OO 

where f2{t) is the impulse response of an anti-aliasing filter. 
Subsequently, the MC demodulator calculates the "demodu- 
lated symbols" 

oo 

71 — — OO 

? = 0, fc = 0,...,A'-l. (5) 

Here, ji,k[n] = j[n — /TV] e-'^'^'^*""'^)/-^ with a receive pulse 
7[n]. Finally, the demodulated symbols n.^ are equalized and 
quantized according to the data symbol alphabet A. 

Combining (|2]i-(|4|i, we obtain an equivalent discrete-time 
channel that is described by the following relation between 
the discrete-time signals s[n] and r[n]: 

oo 

r[n] = h[n,m\s[n — m] + z[n] , (6) 

m— — oo 

with the discrete-time time-varying impulse response 
h[n,m] = J^^J^^hit + nT,,T)hit-T + m%)f2i-t) dtdr 
and the discrete-time noise z[n] = z{t) f2{nTs — t)dt. 

CP-OFDM is a simple special case of the pulse-shaping MC 
framework; it is obtained for a rectangular transmit pulse g^n] 
that is 1 for n = 0, . . . , A'— 1 and otherwise, and a rectangular 
receive pulse 7[n] that is 1 for n = N — K, . . . , N ~1 and 
otherwise {N-K >0 is the CP length). 
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B. System Channel 

Next, we consider the equivalent system channel that sub- 
sumes the MC modulator, interpolation filter, physical channel, 
anti-aliasing filter, and MC demodulator. Combining (|5]), (|6]l, 
and ([T]i, we obtain 

TLk = ^ ^ Hi^k-i'M' ai'^k' + zi^k , 

11=0 k'=0 

l = 0,...,L-l, k = 0,...,K-l, (7) 

with zi^k = (z^Lk) = Er=-oo^["]7r,fcN- The system 
channel coefficients Hi k-i'.k' describe ICI for k^k' and 1 = 1' 
and ISI for / ^ they can be expressed in terms of h[n, to], 
g[n\, and 7[n] [4]. 

Let 7[n] be zero outside {0, . . . , L^}. To compute ^ in (jSj 
for 1 = 0,..., 1^1, we need to know r[n] for n = 0, . . . , A'^^^ 1, 
where = {L—1)N + L^ + 1. In this interval, we can rewrite 
© as 

oo Af^-l 

^["] ^ 5^ 5/i[to, i] s[7i — to] e-'^'^"™^ + z[n] , (8) 

m— — oo z^O 

with the discrete-delay-Doppler spreading function [42] 

5'/jTO,i] = — ^/i[n,TO]e ^^'""r , to, i G Z , (9) 

n=0 

which represents the channel in terms of discrete delay (time 
shift) TO and discrete Doppler frequency shift i. Combining 
(|5]l, (|8j, and ([T]i, and assuming that h[n, to] is causal with 
maximum delay at most A'— 1, i.e., h[n,m] = for to ^ 
{0, . . . ,K—1}, we reobtain the system channel relation (|7]i, 
however with the system channel coefficients Hi,k-i'.k' now 
expressed in terms of the delay-Dopler representation Sh[m, i]. 
Specializing this expression to {I', k') = (/, k) and using the 
approximation iV^ w LN (which is exact for CP-OFDM) 
yields the following expression for the diagonal channel coef- 
ficients Hi k — Hi^k;i,k {L is assumed even for mathematical 
convenience): 

A'-l L/2-1 

Him = Y. F[mAe-^^<'^-'-i\ 

m=0 i=-L/2 

l = 0,...,L-l, fc = 0,...,A'-l, (10) 

with 

N-i i + qL 

q=0 

(11) 

Here, A^Jm,^) ^ EZ-ooli^] 9*[n - m]e-^^-^^ is the 
cross-ambiguity function [43] of 7[n] and g[n]. 

III. Compressive Channel Estimation 

We now present the basic compressive channel estimation 
method [22,29]. This method enables estimation of the diag- 
onal channel coefficients Hi k = Hi^k;i,k, which is sufficient 
for mildly dispersive channels. 



A. Pilot-assisted Channel Estimation 

Our goal is to estimate the system channel coefficients 
Hi.k = Hi^k;i,k from the system channel output n.^, aided 
by some known pilot symbols. For practical (underspread 
[42]) wireless channels and practical transmit and receive 
pulses, F[TO,i] in (fTTT i is effectively supported in a subregion 
of the delay-Doppler plane. Thus, hereafter we assume that 
the support of F[m,i] (within the fundamental i period 
{— L/2, . . . , i/2 — 1}; note that F[m,i] is L-periodic in i) 
is contained in {0, . . . , D — 1} x { — J/2, . . . , J/2 — 1}, where 
D < K and J < L. Here, J is chosen even, and D and J are 
such that AK = K/D and Ai = i/J are integers. Note that 
we also allow the limiting case of full support in either or both 
dimensions, that is, D = K (i.e., AA' = 1) and/or J = L (i.e., 
AA = 1). Because of (ITOl i. the Hi k are then uniquely specified 
by their values on the subsampled time-frequency grid 

g = {{l,k) = {XAL,kAK) : X=0,...,J-1, 

K= 0,. ..,£>-!}. 

These subsampled values are given by 

D-l .J/2-1 
m=0 i=-J/2 

A = 0,...,J-1, K = 0,...,A'-1. (12) 

The time-frequency subsampling is desirable because it re- 
duces the dimensionality of the estimation problem, and thus 
tends to result in better estimation performance. 

Suppose now that pilot symbols ai^k — Pi,k are transmitted 
at time-frequency positions (/, k) G V, where V C G, i.e., 
the pilot position set V is a subset of the subsampled time- 
frequency grid Q. For mildly dispersive channels, the ISI 
and ICI are small. Then, at the pilot positions (I, k) G V, 
it is convenient to rewrite the system channel relation (|7]l 
as r;_fc = Hi kPi.k + zi,k, where all ISI and ICI are now 
subsumed by the noise/interference term zi,k- Based on this 
relation and the known pi,k, the receiver calculates channel 
coefficient estimates Hi^k at the pilot positions according to 

Hi^k = — = Hi,k + ^ , (l,k)eV. (13) 
Pi,k Pi.k 

The last expression shows that the Hi^k for (/, fc) G T' are 
known up to additive noise/interference terms zi,k/pi,k- A 
conventional channel estimator then uses some interpolation 
technique to calculate channel estimates Hi^k for all {l,k) 
from the Hi_k for {l,k) G V (e.g., [12-17]). In contrast, the 
proposed compressive channel estimator uses a CS recovery 
technique to obtain an estimate of A [to, i] and, in turn, of the 
Him- 

B. Some CS Fundamentals 

Before presenting the CS-based channel estimator, we need 
to review some CS fundamentals [1,2]. CS considers the 
sparse reconstruction problem of estimating an (approxi- 
mately) sparse vector x G C*^ from an observed vector of 
measurements y G based on the linear model ("measure- 
ment equation") 
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y = *x + z . (14) 

Here, $eC'^^^^ is a known measurement matrix and zeC*^ 
is an unknown vector that accounts for measurement noise and 
modeling errors. The reconstruction is subject to the constraint 
that X is (approximately) S-sparse, i.e., at most S of its entries 
are not (approximately) zero. The positions (indices) of the 
significantly nonzero entries of x are unknown. Typically, the 
number of variables to be estimated is much larger than the 
number of measurements, i.e., M':^ Q. Thus, $ is a fat matrix. 

We briefly review some CS recovery methods. Basis pursuit 
(BP) [44, 45] and orthogonal matching pursuit (OMP) [46] are 
probably the most popular ones. Whereas for BP theoretical 
performance guarantees are available, OMP lacks similar 
results. However, OMP allows a faster implementation, and 
simulation results even demonstrate a better performance. Low 
computational complexity is important since the channel has to 
be estimated in real time. CoSaMP [47] allows an even faster 
implementation than OMP. (Note that subspace pursuit [48] 
is a very similar method.) Using an efficient implementation 
of the pseudoinverse by means of the LSQR algorithm [49], 
we observed a run time that was only less than half that 
of OMP, and a performance that was only slightly poorer 
An advantage of CoSaMP is the availability of performance 
bounds. Hence, CoSaMP offers a good compromise between 
low complexity, good practical performance, and provable 
performance guarantees. 

The performance guarantees of BP and CoSaMP are phrased 
as an upper bound on the approximation error 1 1 x — x 1 1 2 , where 
X denotes the estimate of x. This bound is valid if the measure- 
ment matrix $ satisfies {1 — 5) < ||*x||2 < (l + <5) ||x||2 
for all ^-sparse vectors x G C^^, with some positive constant 
6. This is known as the restricted isometry property (RIP), and 
the smallest 5 is termed the restricted isometry constant 5s- 
For a small bound on ||x— x||2, 8s should be small. It has 
been shown [1,50,51] that if $ e C'^^^^ is constructed by 
selecting uniformly at random Q rowflfrom a unitary MxM 
matrix U and normalizing the columns (so that they have unit 
£2 norms), a sufficient condition for $ to satisfy the RIP with 
a restricted isometry constant that is bounded as < 7 with 
probability 1— 77 is provided by the following lower bound on 
the number of observations: 

Q > C7-2(lnA/)Vu^ln(l/?]). (15) 

Here, /zu — \/M maxij \Ui,j \ (known as the coherence of U) 
and C is a constant. 

Further CS recovery methods include thresholding [52], the 
stagewise OMP [53], the LARS method [54,55], the Lasso 
[56,57] (equivalent to BP denoising [57]), and Bayesian 
methods [58,59]. In [29,30], the Dantzig selector (DS) [60] 
was applied to sparse channel estimation. DS satisfies optimal 
asymptotic performance bounds when the noise vector z is 
modeled as random. However, for the practically relevant case 
of finite (moderate) Q and M, the performance of DS is not 
necessarily superior. In our experiments, we did not observe 
any performance or complexity advantages of DS over BP, 
OMP and CoSaMP 

'That is, all possible choices of Q rows are equally likely. 



C. Basic Compressive Channel Estimator 

We now combine pilot-assisted channel estimation with 
CS recovery. The central assumption of compressive channel 
estimation is that Sh[m,i\ is "compressible" [45] or approxi- 
mately 5-sparse, i.e., at most S values of Sh[m, i] (in the fun- 
damental i period {— i/2,...,L/2— l})are not approximately 
zero. This approximate "delay-Doppler sparsity" assumption 
will be further considered in Section |IV] Note that it implies 
that also F[m,i] = ^^^-^ 5h[m, i + qi] A; ,,(m, ^) is 
approximately ^-sparse. 

Our starting-point is the 2-D DFT relation (fT2]) . which can 
be written as the 2-D expansion 

D-l J/2-l 
m=0 i=-J/2 

with am.i — V JD F[m, i] and Um,i[^TiA — 

(l/\/jD)e-J27r(Km/D-A»/,/)^ j^^^ functions ' HxAL,kAK 

and Um^i[X,K] are defined for A = 0, . . . , J — 1 and 
K = 0, . . . , D — I and may thus be considered as J x D 
matrices. Define the vectors h = vecji^A al.kAA'} and 
u,„,i = vec{u,„ ,i[A, k] } of length JD by stacking all 
columns of these matrices (e.g., h = [hi ■ ■ ■ hjoY' with 
/ikj+a+1 = Hxal,kAk)- We can then rewrite (O as 

D-l J/2-1 

h = ^ ^ a^^tUjn.i = Ua, (17) 

m=0i=-J/2 

where a = vec{am,i} and U is the JD x JD matrix whose 
((i + J/2)D + m +l) th column is given by the vector Um,i. 
Because the ^ are orthonormal, U is a unitary matrix. 

According to Section IIII-AI there are pilot symbols 
at time-frequency positions {l,k) (zV. Thus, \V\ of the JD 
entries of h are given by the channel coefficients Hi k at the 
pilot positions {I, k) G V. Let h^P^ denote the corresponding 
length- IT' I subvector of h, and let U^p) denote the \'P\ x JD 
submatrix of U constituted by the corresponding [V] rows of 
U. Reducing ( fTTT i to the pilot positions, we obtain 

h(p) = U^P'a = *x, (18) 

with * = y^U(P) and x = y^a. Note that * is 
normalized such that its columns have unit £2 -norm, and that 
the length- JD vector x is, up to a constant factor, the vector 
form of F[m,i\. 

Our task is to estimate x based on relation ( fTSl ). The vector 
h'P^ is unknown, but we can approximate it by the corre- 
sponding vector of pilot-based channel coefficient estimates 
Hi k I ^^^^ (see (fT3Tl). For consistency with the notation used 
in Section IIII-BI this latter vector will be denoted as y (rather 
than h'P'). According to ( fTST l, y = h'P' + z, where z is the 
vector of noise/interference terms zi,k/pi,k\fj^ j^-^f^^y- Inserting 
(118b , we finally obtain the measurement equation 

y = #x + z . (19) 

The vector x is approximately ^-sparse because Sh[m,i] 
was assumed approximately 5-sparse. Thus, (fT9] l is seen to 
be a sparse reconstruction problem of the form ( fT4l i. with 
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dimensions M = dim{x} = JD and Q = dim{y} = \V\ 
and sparsity 5*. We can hence use one of the CS recovery 
techniques reviewed in Section UlI-BI to obtain an estimate of 
X or, equivalently, of a = 



/JD 



From the estimate F[m,i\ of F[m,i], estimates of all channel 
coefficients i// ^ are finally obtained via ( fTOl i. 

According to its definition <^ ~ \J ^ U^p', the measure- 
ment matrix 4> is constructed by selecting \V \ rows of the uni- 
tary JDxJD matrix U and normalizing the resulting columns. 
This agrees with the construction of $ described in Section 
IIII-Bl in the context of BP and CoSaMP. To be fully consistent 
with that construction, we have to select the \P\ rows of U 
uniformly at random. The indices of these rows equal the jT'l 
indices within the index range {!,..., J D} of the channel 
vector h that correspond to the set of pilot positions V. We 
conclude that the pilot positions {l,k)^V have to be selected 
uniformly at random within the subsampled time-frequency 
grid Q, in the sense that the \V\ "pilot indices" within the index 
range {!,..., JD} of h are selected uniformly at random. 

For BP and CoSaMP, in order to achieve a small upper 
bound on the reconstruction error ||x — xj|2 as discussed in 
Section IIII-BI the number of pilots should satisfy condition 
(fTSl l. In our case, this (sufficient) condition becomes 

\r\ > Cj-'{HJD)ysHi/7j), 

with an appropriately chosen 7 (note that i^ljj = 1). This 
bound suggests that the required number of pilots scales at 
most linearly with the delay-Doppler sparsity parameter S and 
poly-logarithmically with the system design parameters J and 
D. Note that the pilot positions are randomly chosen (and 
communicated to the receiver) before the beginning of data 
transmission; they are fixed during data transmission. 

IV. Delay-Doppler Sparsity and Leakage Effect 

In this section, we analyze the sparsity of the channel's 
delay-Doppler representation for a simple time-varying multi- 
path channel model comprising P specular (point) scatterers 
with fixed delays Tp and Doppler frequency shifts i/p for p = 
1, . . . ,P. This simple model is often a good approximation 
to real mobile radio channels [61,62]. The channel impulse 
response thus has the form 



p=i 



(20) 



where r/p characterizes the attenuation and initial phase of the 
pth propagation path and 5{-) is the Dirac delta. The discrete- 
delay-Doppler spreading function (|9]l then becomes 

P Nr-l 

p—l n— 



P 

p=l 



V T' 



i-i^nTM , (21) 



with 



where 



aM(x,2/) ^ 0('')(x)V(y), 



(22) 



sin(7ry) 
NrSm{7ry/Nr) 



(23) 



It is seen from (ISTT l that, although we assumed specular 
scattering, Sh[m,i] does not consist of Dirac-like functions 
at the delay-Doppler points of the scatterers, (Tp/Ts, VpTsNr). 
Rather, there occurs a leakage effect which is characterized 
by the function A^'^^{x,y) = (l)^'^\x)ilj{y), and which is 
stronger for a broader A^'^'>{x,y). The leakage effect is due 
to the finite transmit bandwidth (w 1/7^) and the finite block- 
length (Nr ~ LN). It is important for compressive channel 
estimation because it implies a poorer sparsity of Sh[m,i]. 
Note that whereas a large blocklength reduces the leakage 
effect, it also implies that the specular model with constant 
parameters ( l20b is a less accurate approximation and, thus, 
that the continuous-delay-Doppler spreading function [42] is 
less sparse. This motivates an extension of the compressive 
channel estimation method that is able to reduce the leakage 
effect (see Section [VTi. 

In view of d^Tt . studying the sparsity of Sh[m, i] essentially 
amounts to studying the sparsity of A'-'^p^m — Tp/Ts,i — 
UpTsNr) = (ji^^p^m - Tp/Ts) - Vp%Nr). To this end, we 
first consider the energy of those samples of 0'^'^^'' (m — Tp/T^) 
whose distance from Tp/T^ is greater than Am € {1, 2, . . . }, 
i.e., |m — Tp/Ts\ > Am. We assume that (j)^'^'>{x) exhibits at 
least a polynomial decay, i.e., < C(l + jx/xol) " 

with s > 1, for some positive constants C and xq. This 
includes the following important special cases: (i) the ideal 
lowpass filter, i.e., fi{t) ^ /2(<) = sinc(i/Ts) with 

sinc(a:) = £Hl(zi£l^ here s = 1; and (ii) the family of root- 
raised-cosine filters: if both fi{t) and f2{t) are equal to the 
root-raised-cosine filter with roll-off factor p, then, for v not 
too large, (/)''^'(.t) w sinc(a;) cos(p7ra;)/[l— (2/92-)^] and s — 3. 
Based on the polynomial-decay assumption, one can show the 
following bound [23] on the energy of all <j)'^'^p'>{m — Tp/Tg) 
with \m - Tp/Ts\ > Am: 



E 

Tp/Ts|>Am 



< 



2s -1 



Am-l\-2s+i 



xa 



Hence, the energy of <j)'^'^p'> (m — Tp/Tg) outside the interval 
[ [Tp/Tg — AmJ , \Tp/ Ts + Am] ] decays polynomially of order 
2s — 1 with respect to Am. 

In a similar manner, we consider the energy of those 
samples of — VpTgNr) whose distance (up to the modulo- 
Nr operation, see below) from VpTgNr is greater than Ai S 
{2, . . . , [7Vr/2j }. Let I denote the set {0, . . . , A^r-1} with the 
exception of all i — iivaodNr, where iz is any integer with 
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\iz — VpTsNr\ < Ai. From ( l23T l. one can obtain the bound [22] 

which shows that the energy of — VpTsN^) outside the 
interval [[vp%Nr- /^i\,\vp%Nr + /^i]] (modulo A^^) decays 
linearly (polynomially of order 1) with respect to Ai. 

From these decay results, it follows that A^'^p' (m— i— 
VpTsNr) = 0(''p)(m - Tp/Ts)ip{i - Vp%Nr) can be con- 
sidered as an approximately sparse (or compressible, in CS 
terminology [45]) function. Thus, as an approximation, we 
can model K''^p\m — Tp/Tg, i — VpTgNr) as A^A-sparse, with 
an appropriately chosen sparsity parameter N\. It then follows 
from (ISTT i that Sh[m,i] is PA^A-sparse, and the same is true 
for F[m,i] in (fTTT l. Unfortunately, A'a cannot be chosen 
extremely small because of the strong leakage that is due to 
the slowly (only linearly) decaying factor il;{i~i'pTsNr). This 
limitation motivates the introduction of a sparsity-enhancing 
basis expansion in the next section. 

V. Sparsity-Enhancing Basis Expansion 

The 2-D DFT relation ST2\ underlying the basic compressive 

channel estimator is an expansion of the subsampled channel 

coefficients Hxal.kAK into the 2-D DFT basis Um,i[^^'^] = 
(l/^/JD)e-J27r(Km/D-A^/J) (ggg j^^^ sparsity of the 

expansion coefficients am.i — VJD F[m, i] was shown above 
to be limited by the slowly (only linearly) decaying function 
■0(z — VpTsNr). In order to enhance the sparsity, we now 
introduce a generalized 2-D expansion of Hxal.kAK into 
orthonormal basis functions Vm.i[^, 

D-l J/2-1 
m=0 i=-J/2 

A = 0,...,J-1, K = 0,...,D-1. (24) 

Clearly, our previous 2-D DFT expansion JTSb . ( ITSI l is a special 
case of (|24] |. 

A. 1-D and 2-D Basis Expansions 

We will choose a basis {wm.i[-^, '«]} that is adapted to the 
channel model (|20] | (but not to the specific channel parameters 
P, Vp^ ™d i/p in (|20]|). Equation (l20l i suggests that the 
coefficients f3,n,i should be sparse for the elementary single- 
scatterer channel h'-^^'^^^t, r) = (5(t-ti) eJ'^'"'^*, for all n e 
[0,T,„ax] and z^i e [-I'max, t-maj- SpeciaHziug (|2B to P = 1 
and = 1, and using (fTTT l. the 2-D DFT expansion ([T2T i 
yields after a straightforward calculation 

D-l 

HxAL.nAK - ^</.(''^)(m-^)c(''i)[m,A]e-^'2-^. 

m=0 

(25) 

Here, we have set 

C(-)KA] ^ 'f]'a^:-)4=e^-2'^^, (26) 

i=-J/2 



where 

- Vjj:i.^^^H^ + qL] A;Jm:^) , ill) 

with ^ eJ^(^iTs-»/JV.)(Af.-i) -i/iT.iV^). 

According to ( |27] |. the poor decay of ^"(2;) entails a poor 
decay of a'^l with respect to i. To improve the decay, we 
replace the 1-D DFT ( |26] | by a general 1-D basis expansion 

j/2-1 

4=-, 7/2 

m = 0,...,£'-l, A = 0, ...,J-1, (28) 

with a family of bases {bm.i[^]}i=-j/2,....j/2-v 
m = 0, — 1 that are orthonormal (i.e., 

J2i=ob7n,iA^]b*^ iAX] = 6[ii - 12] for all m) and do 
not depend on the value of vi in C^'^^'^[m, A]. The idea is to 
choose the 1-D bases {^m,i[A]}j^_jy2 J^2-i ^'^''^ '■^^ 
coefficient vector [/3f,^^l//2 ' ' ' i*^™ j/2-1] sparse for all 
m and all vi E [— i^max, I'max]- Substituting (l28l l back into 
(|25] |. we obtain 

m=0i=-J/2 

X 6™,. [A] 

This can now be identified with the 2-D basis expansion f2M . 
with the orthonormal 2-D basis 

[A, ^ ^ b,n,^ [A] £-■'■2-^ (29) 

VD 

and the 2-D coefficients p'^'^A'^ = VD (jj^^^^m - 
t) ^m^i ■ basis functions Wm^i[A, k] are seen to agree 
with our previous 2-D DFT basis functions u,„_i[A,K] = 
(l/VJi^)e--'27r(Km/_D-AV./) ^ith rcspcct to K, but they are 
different with respect to A because e^^'^^'^'^ is re- 

placed by fom,i[A]- Furthermore, the sparsity of P^^^A^'^ 
the i direction is governed by the new 1-D coefficients 
i^m^i' which are potentially sparser than the previous 1-D 
coefficients a^'^^j in ( 1261 ) that were based on the 1-D DFT 
basis {(l/VJ)e^27rA'/,/|^ 

These considerations can be immediately extended to the 
multiple-scatterer case. When the channel comprises P scat- 
terers as in (|20] |. the coefficients are f3r,i,i — J2p=i Vp Pm^'f'''^- 
If each coefficient sequence f^l^''^"''^ is S'-sparse, /3m,i is PS- 
sparse. Note that, by construction, our basis {vm,i[^' '^]} '^'^^^ 
not depend on the channel parameters P, ijp, Tp, and fp, and 
its formulation is not explicitly based on the channel model 
( l20t . The use of the generalized 2-D basis {vm,i[^^ '^]} < l29] l 
comes at the cost of an increased computational complexity, 
because efficient FFT algorithms can only be applied with 
respect to k but not with respect to A. However, if J is not 
too large, the additional complexity is small. Optimal designs 

of the 1-D bases {bm,i[^]}i=-j/2 j/2-1 ^^^^ presented 
in Section |Vl] 
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B. Generalized Compressive Channel Estimator 

A CS-based channel estimation scheme that uses the gen- 
eralized basis expansion ( l24l i can be developed similarly as 
in Section Hira We can write (EDi as (cf. ^) h = V/3, 
with a unitary matrix V. Here, /3 and V are defined in an 
analogous manner as, respectively, a and U were defined 
in Section IIII-CI Reducing this relation to the pilot positions 
yields (cf. ^) h^P) = V(p)/3 = *x, with * ^ V^p'D and 
X = D~^/3, where the diagonal matrix D is chosen such that 
all columns of $ have unit €2 -norm. Finally, we replace the 
unknown vector h'P^ by its pilot-based estimate, again denoted 
as y. Using ( 113b . we then obtain the measurement equation 
(cf. (fT9] l) y = ^^x + z, where z is again the vector with entries 
5i,fc/pi,fc|(j ^.•jg.p- As in Section UlI-CI our task is to recover 
the length- JI? vector x from the known length- jT'l vector 
y, based on the measurement equation. From the resulting 
estimate of x, estimates of the channel coefficients Hi k on 
the subsampled grid Q are obtained via (l24t by means of the 
equivalence of /3,„_i and (3 = Dx. Inverting^ (fT2li and applying 
([Tol l then yields estimates of all channel coefficients Hi^k- As 
discussed further above, we can expect /3 and, in turn, x to 
be approximately sparse provided the 1-D bases {&m,z[A]} are 
chosen appropriately. Hence, our channel estimation problem 
is again recognized to be a sparse reconstruction problem of 
the form (fT4l i. with dimensions M = dim{x} = JD and 
Q = dim{y} = |7^|. We can thus use a CS recovery technique 
to obtain an estimate of x. 

For consistency with the CS framework of Section IIII-CI 
we select the pilot positions uniformly at random within the 
subsampled time-frequency grid Q. For BP and CoSaMP, to 
achieve a small upper bound on the reconstruction error, the 
number of pilots should satisfy condition (fTSl l. i.e., 

\V\ > C7~'(ln(Ji?))Vv^ln(l/r/), 

where 5* is the sparsity of x and /^v is the coherence of 
V. Note that S depends on the chosen basis {wm.i[A, k]}; 
furthermore, /^v > 1 (for the DFT basis, we had fijj ~ 1). 
Thus, the performance gain due to the better sparsity may be 
reduced to a certain extent because of the larger coherence. 

VI. Basis Optimization 
We now discuss the optimal design of the 1-D bases 

A. Basis Optimization Framework 

The orthonormal 1-D bases {bm.i[M}i=- j/2 7/2-1' ~ 
0, . . . , D — 1 should be such that the coefficient vectors 

[^m^^ j/2' ' ' ''^m J/2-1] sparse for all m and all v e 

] (the maximum Doppler frequency shift 

^max 

is assumed known). For our optimization, we slightly re- 
lax this requirement in that we only require a sparse co- 
efficient vector for a finite number of uniformly spaced 

-Note that the 1-D part of j24t corresponding to index m equals 
the respective 1-D part of I12t (I-D DFT), since Dm i[A, ft] = 
(1/VD) 6m,i [A] e--'2'rK"i/D Hence, the transformation and the in- 
verted transformation JI2t have to be applied only with respect to the index 
i. 



Doppler frequencies e 2?, where V = {i^Ad, d = 

- [I'max/i^Al , • • • , [t'max/ l^A^ } with somc Doppler frequency 
spacing i^a- 

Regarding the choice of i^a, it is interesting to 
note that for the "canonical spacing" given by i'a = 
l/{TsNr), the coefficients a'-J^'^^'^^ in the 1-D DFT ex- 
pansion ( l26l l are 1-sparse with respect to i. Indeed, 
= eJ'^(''i^=-'/^'-)(^--i)iA(i-i/iTs7Vr) here simplifies 
to V'^'^'^'H = eJ'^(''-*)(^'-i)/^'- = f^ArJi-d], where 
(5jv^[i] is the iVr-periodic unit sample (i.e., SN^ii] is 1 if i is a 
multiple of A^^ and otherwise). Expression dZTb then reduces 
to 

= Vj''^S^A^-d + qL]A*Lr,l±lIi) 

q=0 ^ 

= SN^[i-d]A;^g(^m,^^ , 

where d depends on d but not on i. Thus, for i/a = l/(7s^r), 
the coefficients obtained using the 1-D DFT basis {6„i_4A] = 
(1/a/J) eJ^'"^^/'-'} are 1-sparse (no leakage effect). This means 
that the 1-D DFT basis would be optimal; no other basis could 
do better We therefore choose a Doppler spacing that is twice 
as dense, i.e., i^a = l/(2'7^A^r)- That is, we define T) such 
that it includes also the Doppler frequencies located midway 
between any two adjacent canonical sampling points. For these 
frequencies — given by fAd for odd d — the leakage (obtained 
with the DFT basis) is maximal. 

Because the basis {6m.i[A]} is orthonormal, the expan- 
sion coefficients P^^J^ defined by ( |28] ) can be calculated as 

the inner products ^^^^ ^ Etl C(-^^[mA]b*^^M z = 

— J/2, . . . , J/2— 1. This can be rewritten as 

0^"'' - B c('') 

with the length- J vectors yS'^' = [/3.m.''_,7/2 ' ' ' ^m,j/2-i] ^^"'^ 
cl'^^ = [C''"^ [m, 0] • • • C^"^) [m, J - 1]] ^ and the unitary Jx J 
matrix B„i with entries (B„i)i+i,A+i = ^m i_//2[A]- We can 
now state the basis optimization problem as follows. For given 
vectors Cm\ m = 0, . . . , I?— 1, with cin' defined as described 
above, find unitary JxJ matrices B,„ not dependent on v such 
that the vectors ft^^ = BmCm^ are maximally sparse for all 

For the sake of algorithmic simplicity, we will measure the 
sparsity of /3,„ by the ^i-norm or, more precisely, by the 
^i-norm averaged over all v e V, i.e., j^J^uev ll'^m'lli = 
Sjye-D ||BmCm^ II J. Thus, our basis optimization problem 
is formulated as the D constrained minimization problem^ 

B„, = argmin ^ ||B™cM||^, m = 0,...,D-l, (30) 

' We note that the optimization problem )30t is similar to dictionary 
learning problems that have recently been considered in [34-36]. In [36], 
conditions for the local identifiability of orthonormal bases by means of 
£1 minimization have been derived. An ^o-norm based sparsity-enhancing 
basis design has been proposed in the MIMO context in [63]. Furthermore, 
basis adaptation and selection at the receiver has been considered in the 
ultrawideband context in [64]. 



g 



where U denotes the set of all unitary J x J matri- 
ces. Note that the vectors Cm'' are known because they 
follow from the function C^'^'^[m,X], which is given by 
(see m, m) C^KA] = Efi-j'/aEr^o'^^"^!* + 
qL] A* g{m, eJ2'^^*/'^. It is seen that the optimal bases 

characterized by the matrices B,„ depend on N, L, J, g[n], 
7[n], and (via the definition of 2?) t'inax, but not on any other 
channel properties. 

For classical CP-OFDM with CP length N-K > D-1, 
we have A^_g(m,^) = Aj^g{0,£^) for all m = 1, . . . , D—1, 



so 



see (l2t)D . (|27] |) and thus Cm'' = 
Cq''''. Because Cm-* no longer depends on m, only one basis B 
(instead of D different bases B™, m = 0, . . . , D — 1) has to 
be optimized. 



B. Statistical Basis Optimization 

The basis optimization framework presented above can be 
extended to take into account prior statistical information 
about the channel. Let us again consider the single-scatterer 
channel h^^^-'''^''^^\t,T) = ryi (5(r-Ti) e^^'^^'i*, now including 
a path gain 771. We assume that ti, vi, and 771 are random, 
with {ti,vi) distributed according to a known probability 
density function (pdf) p{ti,vi), and 771 given {ti,vi) be- 
ing zero-mean, circularly symmetric complex Gaussian with 
known variance (7^(ri,i^i). As before, we consider a 2-D 
expansion of the subsampled channel coefficients H\al,kAK 
into (deterministic) orthonormal basis functions Vm^i[X, k], 

i.e., HxAL,kAK = Z]m=0 Sj'i- ,7/2 ['^' '^l' ^ = 

0, . . . , J - 1, K = 0, 1. Clearly, the vector /3 of 

expansion coefficients /?,„,i (which is defined as in Section 
IV-Bb now is a random vector. Our goal is to find basis 
functions v„i^i[X,K] (or, equivalently, a unitary matrix V, 
defined as in Section IV-Bb such that j3 = /3(V) is maximally 
sparse on average. Measuring the sparsity of (3 by the £i-norm 
for convenience, we obtain the optimization problem 



V = argmin E{||/3(V)|U 



(31) 



where E{ } denotes expectation and W denotes the set of all 
unitary JD x JD matrices. 

Again, we set v^,,[X,k] ^ (1//D) 6™.,[A] e-^^— /i? 
with a family of orthonormal 1-D bases {&,„,i[^]}. Then, 
dSTT i reduces to the minimization of E{||/3||j^} with respect 
to {6m,i[A]}. For the single-scatterer channel, the £i-norm of 
13 = i^lrui'i.m) can be shown to be 



(ti, 1^1,171) I 



J/2-1 

X E 



.1-1 



Y,C^^^\m,X]bUX] 



with C'^''i)[77z, A] as in (|26ll, (|27]i. We note that \'qi\ given 
(ti,i/i) is Rayleigh distributed with mean a{Ti, Vi)\Jti/2. 
Hence, E{ ||/3(^i''^i'''i) |U is given by (hereafter, we write 



T, rj instead of ti, t^i, 771) 



D-1 



J 12-1 
X E 



J-1 



Y^C^''\m,X]bl^^.^[X] 



A=0 



dv , 
(32) 



with 



1 ^ 

771 = 



(t(t, v) (j)^^^ (m — — ) p{t, v) dr > 
\ If, / 



It follows that minimizing ( |32] | with respect to {&m,i[A]} 
amounts to minimizing 



,7/2-1 

E 

i=-J/2 



J-1 

E 

A=0 



C('^)[7n,A]5;,JA] 



G'^''^[m]diy (33) 



for all 771 = 0, ... , D — 1. Note that G''^-' [777] can be computed 
from the known statistics. In 



vector-matrix notation, with 



and the unitary JxJ 

matrix B,„ with entries {Bm)z+i,\+i - ^m.i-j/21-^]' "^"i" 
mization of ( l33b can be equivalently written as minimization 
of 

(34) 



|B„,cM|Lg('')H diy 



over the set U of all unitary JxJ matrices B„i, 
for 771 = 0, . . . , D — 1. Approximating this integral by 
its Riemannian su over the set V = {i^Ad, d = 

- [I'max/l^Al , • • ■ , \l^ma^/'^A^ } with l^A = 1/ {2TsNr), for a 

given maximum Doppler frequency I'max, the minimization 
problem can be finally stated as 



B, 



arg mm 



IB c'")! 



With cL") ^ cL^^G^'') 



777 



(35) 

for 777 = 0, 1. This is recognized to be of the same 

form as ( l30l l. 

In practice, the channel statistics pij^v), (j'^{t,v) will 
deviate from the true statistics to some extent, so that the 
basis matrices B„i obtained as described above will be dif- 
ferent from the truly optimal ones. An interesting question 
is as to how this difference affects the average sparsity of 
the expansion coefficient vector /3(^''''''l For simplicity, we 
measure the average sparsity by E{|j/9||j^}, and we assume 
that the optimization criterion is minimization of ( l34l i (which, 
after all, is almost equivalent to ( l35l l) and, further, that AL = 1 
or equivalently J = L (i.e., no subsampling with respect to 0- 
Let P and /3 denote the expansion coefficient vectors obtained 
for the true and incorrect bases, respectively. Then, one can 
show the following bound on the normalized difference of the 
average sparsities of /3 and (3: 

|E{||^||J-E{i|/3||J| 
E{||/3|li} 

^Alternatively, the integral can be inteipreted as an expectation with respect 
to u and computed by means of Monte Carlo techniques. This is especially 
advantageous if the maximum Doppler frequency is unknown. 
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J — oo Z^m=0 




\A^^g{m,i^Ts)\ 













< 2Vl 



where G'^"-' [to] is defined analogously to G^'^-' [to] but with the 
incorrect statistics. 



C. Basis Optimization Algorithm 

Because the minimization problems ( l30l l and (l35T l are 
nonconvex (since U is not a convex set), standard convex 
optimization techniques cannot be used. We therefore propose 
an approximate iterative algorithm that relies on the following 
facts [65]. (i) Every unitary JxJ matrix B can be represented 
in terms of a Hermitian JxJ matrix A as B = e-'^. (ii) 
The matrix exponential B ~ can be approximated by its 
first-order Taylor expansion, i.e., B w 1/ + jA, where Ij 
is the JxJ identity matrix. Even though B is unitary and 
Ij + jA is not, this approximation will be good if ||A||^ is 
small, where || A||^ denotes the largest modulus of all entries 
of A. Because of this condition, we construct B^ iteratively: 
starting with the DFT basis, we perform a small update at 
each iteration, using the approximation B « Ij + jA in the 
optimization criterion but not for actually updating B™ (thus, 
the iterated B^ is always unitary). More specifically, at the 
rth iteration, we consider the following update of the unitary 



matrix B 



(r). 



bw 



(r) 

where Am is a small Hermitian matrix that remains to be 
optimized. Note that Bm^^-* is again unitary because both 'B^n 
and e^^™' are unitary. 

Ideally, we would like to optimize A„ 
(l30]l (or (|35]l), i.e., by minimizing X],.ex) ii"'" m 
Yliv£V lle"'^"' Bto' Cm'' II Since this problem is still noncon- 
vex, we use the approximation « Ij + jA, and thus the 
final minimization problem at the rth iteration is 



according to 



|R(r+l)„(«^)| 



AS""' = 



arg mm 



|(Ij+jA)bMc 



(36) 



Here, Ar is the set of all Hermitian JxJ matrices A that are 
small in the sense that ||A||^ < p,., where p,, is a positive 
constraint level (a small pr ensures a good accuracy of our 
approximation B « Ij + jA and also that e^^™ is close to 
Ij). The problem (l36l l is convex and thus can be solved by 
standard convex optimization techniques [66]. 

The next step at the rth iteration is to test whether the cost 
function is smaller for the new unitary matrix 



'B^rli.e., 



1 <E. 



In 



whether Y^^ev 

the positive case, we actually perform the update of Bm and 
we retain the constraint level pr for the next iteration, i.e., 

^rn ^ "m. i 



Pr+1 — Pr 



Otherwise, we reject the update of B 
constraint level p^, i.e.. 



(r) 



Pr+1 



and reduce the 

Pr_ 

2 ' 



By this construction, the cost function sequence Yvi^v ||-^ 



(r) 



C 

ing 



, r = 0, 1, . . . is guaranteed to be monotonically decreas- 



The above iteration process is terminated if pr falls below 
a prescribed threshold or if the number of iterations exceeds 
a certain value. The iteration process is initialized by the Jx 
J DFT matrix Fj, i.e., Bm'' = Fj, because the DFT basis 
was seen in Section |IV] to yield a relatively sparse coefficient 
vector We note that efficient algorithms for computing the 
matrix exponentials e^^"^ exist [65]. Since the bases {6m,i['^]} 
(or, equivalently, the basis matrices Bm) do not depend on the 
received signal, they have to be optimized only once before 
the actual channel estimation starts. 

In Fig. [1] we compare the expansion coefficients a^.i 
obtained with the DFT basis (see ( fT6] l) and /3m, i obtained with 
the deterministically optimized basis (see ( l24b . ( |29] l) for one 
channel realization. The system parameters are as in Sections 
IVIII-AI and IVIII-BI (first scenario). For the minimization ( |36] | 
(not TO-dependent, since we consider a CP-OFDM system), 
we used the convex optimization package CVX [67]. It is seen 
that the basis optimization yields a significant enhancement of 
sparsity. 

VII. Channel Estimation for Strongly Dispersive 
Channels 

For strongly dispersive channels, the off-diagonal 
system channel coefficients (ISI/ICI coefficients) 
{Hi,kd'.k'}{i,k)^{i',k') in O are no longer negligible. 
Therefore, we now present a compressive channel estimator 
that is able to produce reliable estimates of all channel 
coefficients Hi^k\i',k'- 

A. Basis Expansion Model 

The proposed channel estimator uses a basis expansion 
model [18-20] that is different from the basis expansion 
considered in Sections |V] and |VT] The discrete-time channel 
impulse response h[n, to] is expanded with respect to n into 
orthonormal basis functions V'j['^]' * = 0, . . . , A^r^ 1, i-c. 



h[n, to] 



E 

i=0 



0, 



with TO-dependent expansion coefficients 

Th[m,i] = h[n,m] ii*[n] 



,iVr"l, (37) 



(38) 



The function Th[m^i] generalizes the discrete-delay-Doppler 
spreading function Sh[m,i\ in (|9]l, which is reobtained for 
i/)j[n] = e^^'"'"/^" (up to a constant factor). Sim- 

ilarly to ([8]), the discrete-time channel can now be rewritten 

as 

Th[m,i]s[n — m\'4'i[n] + z[n] , 



00 

E 



-00 'i— 



0, 



,Nr-l. (39) 



We assume that the support of Tft,[TO, i] is contained in 
{0, . . . , _D — 1} X {0, . . . , J — 1} (/i[n, to] is assumed causal 
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(a) (b) 

Fig. 1. Sparsity enhancement obtained with the proposed iterative basis optimization algorithm: Modulus of the expansion coefficients for (a) the DFT basis 
and (b) the optimized basis. 



with maximum delay at most D—1). Combining Q, ( |39] ). and 
([T]i, we then reobtain the system channel relation (|7]i, with the 
channel coefficients Hi^k:i',k' expressed as 

D-1 J-l 



m=0 i=0 



n\ e 



n— — oo 



X g[n-m-{l'-l)N]tl;i[n + lN] 



(40) 



Note that the limiting cases D — K and J = Nr are also 
allowed. 

B. Compressive Channel Estimator 

The proposed compressive channel estimator operates in 
an iterative, decision-directed fashion. At the first iteration, it 
utilizes the knowledge of some pilots pi,k^A with (Z, k)£'P. 
The pilot position set V is selected uniformly at random within 
{0, . . . , L—1} X {0, ... , K—1}. At later iterations, the estimator 
additionally uses virtual pilots, which are based on the symbol 
decisions produced by a suitable ISI/ICI equalizer (e.g., [40, 
68-71]) followed by the quantizer Typically, the equalizer will 
use the (estimated) channel coefficients Hi^k;i',k' only within 
a certain "off-diagonal bandwidth," i.e., for — < /max and 
|fc — fc'l < fcmax (modulo K). 

At the rth iteration, let p'f^ denote "extended pilots" (pilots 
augmented by virtual pilots) on an extended pilot position set 
V^''\ This set is defined as V^''^ = U^^) Q)V = {{l,k) = {li + 
l2,{ki + k2)modK) : (?2,fc2)eV}, where 



V ^ {(/,fc) : I = -In 



,1. 



k 



and 'H^'"^ will be specified later. Note that by this construction 
for an extended pilot in "H'' ', all neighboring symbols (which 
yield the largest interference) are also included in 'P^'^\ Then, 
for {l,k) e n^''\ relation O can be written as 

(l',k') G {(i,fc)}eV 

(41) 

where the noise/interference term includes noise, ISI/ICI 
from outside the set {{I, k)} ©V, and — possibly — some addi- 
tional errors if p], j^, aj/ fc/. If V is chosen sufficiently large. 



the ISI/ICI part in z}'"^ is negligible. Inserting (gOll into (HTT i 
yields the noisy 2-D expansion 



D-l J-l 

ri,k X! X! ^"'-^ ' 

r?j.=0 1=0 



^^)j/,fc] + z[t (/,fc)eHW, (42) 



with 0.,n^,^n[m,i] and «^™^J/, fc] e {(;.fc)}ffiV Ppk' 

l)N]i:,[n + lN]] e-J27rfe'm/x^ Differently from ([Tg and dal, 
this is an expansion of the demodulated symbols r; ^ and not of 
the channel coefficients Hij^. Note also that the basis functions 
w'^\[l, k] depend on the extended pilots (/, k)€V'^^\ 

Using a stacking as in Section IIII-CI the expansion (l42T l 
can be expressed as r*^''^ = + z'''\ where the 

dimensional vectors r^*"^ and the JZ?-dimensional vector 
9, and the \'H^''^\ x JD matrix W''') are defined in an 
analogous manner as, respectively, h'p', z, a, and U'p-' in 
Section Hira With y('') = r^''), = W^'^^D^''), and 

^(r) A (J)ir)^-i0^ where the diagonal matrix D^*"^ is chosen 
such that all columns of ^^''^ have unit ^2 -norm, we obtaiijfjthe 
measurement equation (cf. ( fT9] l) y'''-' = $('')x'''^ +z'''^. As in 
Section UlI-CI we would like to recover the length- JZ? vector 
x'*"^ from the known length- jH*^ ''-'I vector y('"l If the basis 
functions in ( |37] | and ( l38T l are chosen such that Th[m, i] 
(or, equivalently, 6) is sparse, then also x'""' = (D'''-')~^0 is 
sparse. Hence, our problem is again a sparse reconstruction 
problem of the form (fT4] i. with dimensions M = dim{x'''^} = 
JD and Q = dimfyf''} = {H^'^^. We can thus use a CS 
recovery techniquqj to obtain an estimate x'''^ of x^' ' and, in 
turn, an estimate 0^ "* = D'^^'^x'''^ or, equivalently, Tj^^^rrLji]. 

From Tl^^\m,i], estimates of the channel coefficients 
Hi,k;i'.k' for all l,V = 0,...,L-l and fc, fc' = 0, . . . , 7^- 1 
are obtained via ( l40l l. Then, an ISI/ICI equalizer yields symbol 

(r) 

estimates a] and, subsequently, a quantizer produces detected 

^The computation of the measurement matrix essentially requires 
i(2/nmx + l)(2fcmax + 1)J FFTs of length K. Note that J is typically 
very small, cf. Section fVlI-CI 

''Whether ^('') satisfies the RIP with a small restricted isometry constant 
depends on the basis functions -ipi [n] as well as on the extended pilot posidon 
set "Pt' ); hence, performance guarantees cannot be made in general. 



11 



symbols aj^, I = 0, . . . , L-l, k = 0, . . . , K-1. On V, these 

are replaced by the known pilots, i.e., we set d'f^ = pi,k for 
{l,k)eV. 

Next, we determine as the largest subset of 

{0, . . . , L — 1} X {0, ...,/<— 1} such that the new extended 
pilot set -^('■+1) © V contains only "reUable" 

detected symbols alf'l, and we define the new extended pilots 

as p^l,^^^ = a| fe for {l,k) e V'^^'+^l Here, following [71], a 

(r) 

detected symbol d] {. will be considered as "reliable" either 
if (/, k) G V or, for (/, k) ^ V, if the corresponding symbol 
estimate a|'2 (result of equalization, before quantization) is 

' (r) 

significantly closer to d^ than to any other symbol in A. For 
example, for the QPSK alphabet A = {1+j, -l+j, -l-j}, 
d|'2 will be considered as reliable either if {l,k) <eV or if 

both |3fJ{a|'2}| > e and |5{a|'2}| > e for a certain threshold 
e>0. ' 

Proceeding iteratively in this fashion, we successively con- 
struct extended pilots p'fl, which are used to estimate [m, i] 
and, via ( |40| ). the channel coefficients Hi^k:i',k' ■ The reliability 
criterion ensures that most of the extended pilots equal the 
true transmitted symbols. Since the p'f^ are improved with 
the iterations, we expect ["H^'+^'l > \'H^'^^\ in general. The 
iterative algorithm is initialized with pf'l — pi^k and T''"' ~ 
= -p (for r = 0, V = {0}, whereas later V = {il,k):l = 
-'max, ■ • ■ , /max; k = -fcmax, • • ■ , fcmax})- Accordingly, we 
use the conventional one-tap equalizer (without ISI/ICI equal- 
ization) at the first iteration. The algorithm is terminated either 
if the difference between H['j^^\, and i^/'J;/^,/ (measured by 
a suitable norm) falls below a certain threshold or after a 
fixed number of iterations. While a proof of convergence for 
this iterative algorithm is not available, we always observed 
convergence for reasonably chosen ijji [n] (see Section IVII-Cl l. 
[Pi and e. 

The proposed algorithm is not limited to strongly dispersive 
channels. For weakly dispersive channels, we simply set V = 
{0} at all iterations and replace the ISI/ICI equalizer by 
the conventional one-tap equalizer. This effectively amounts 
to a decision-directed, iterative extension of the compressive 
channel estimator discussed in Sections HIIUVII This extension 
can improve the estimation accuracy. Moreover, it can increase 
the spectral efficiency of the system even further, since the 
pilot set V can be chosen quite small due to the successive 
improvements achieved by the iterations. However, these gains 
come at the cost of some additional complexity. 

C. Sparsity-Inducing Basis Functions 

The basis functions V'li"-]' * = . . . ^Nr — 'l have to be 
chosen such that the generalized spreading function Th[m, i] in 
(l38T l is sparse. In particular, ( l20l i suggests that Th[m, i] should 
be sparse for the single-scatterer channel h^'^^^'^'-^t, t) ~ 5{t— 
n) e-'^'^'^i*, for all n e [0, Tmax] and vi G [-I'max, I'max]- For 
this channel. 



Nr-l 

with §^"^[1] = (43) 

n=0 

The factor ip^'^^^irn — Ti/Tg) (see ( l22l i) is already sparse due 
to its fast decay as discussed in Section |IV] Thus, we have to 
design the ij)i[n] such that the factor is sparse for all 

^ ^ [ ^max5 ^max]- 

For this purpose, we can adapt the basis optimization of Sec- 
tion [Vll Let V = {v^d, d = -[fmax/l'Al , • ■ • , [l^max/l^Al } 

with v/\ = l/{2%Nr) and rewrite the second equa- 
tion in (|43]) as i?*^") = Pe*"), with the length- A^,, vec- 
tors tJf"^) = [z?M[0] ••• i9('')[iV,. - l]]"^ and e^"^) = 
^j2^v% , , , ^j2^v(N^-i)%Y unitary Nr x 

Nj. matrix P with entries (P)j+i „_|_i = "0 ^P" 
timal basis functions are now defined as P = 

argminpgi^ X^i^eP l|P^'"''''lli' ^o that the iterative optimiza- 
tion algorithm of Section lVI-Cl can be used. However, for large 
Nr ~ NL, the computational cost of this approach is quite 
high. 

As a practical alternative, we propose a construction of 
the il)i\n\ that involves discrete prolate spheroidal sequences 
(DPSSs) [37]. Basis expansion models using DPSSs have been 
considered previously [20]. If their design parameters are cho- 
sen according to maximum Doppler frequency i^max, sampling 
period %, and blocklength iV^, the corresponding functions 
■d^^"^ [i] in ( l43l l will have an effective support {0, . . . , J— 1} for 
all V £ [— i^max, i^max], whcrc J is Small compared with A^,.- 
Unfortunately, within this support interval, the 'd'^^ [i] are not 
sparse in general. 

We will therefore use a specific combination of DPSSs and 
DFT basis functions, which yields functions '(^(''^[i] that are 
still effectively zero outside {0,...,J— 1} but, within that 
interval, preserve the sparsity obtained with the DFT basis. Let 
-0,1''^ [n], ??, e Z, i = 0, . . . , - 1 denote the DPSSs that are 
bandlimited to [— i^maxTs, J^max^^] and have maximum energy 
concentration in {0, . . . , A^r — 1} [37]. In what follows, the 
DPSSs Vl''^ N will be truncated to {0, . . . , iV,.-l}. Then, for 
large TV^, the support of d'^^i] = E^=o ^eJ^'^''"^^ [n] 
is effectively contained in {0,...,J — 1} for all v G 
[-fmax, I'max], where J = 2Jo+Ji with Jo = [j/max^siVrJ and 
Ji > 2 a small integer. In addition, we consider the 2 Jq+I or- 
thonormal DFT basis functions tpf\n] = (I/VN^) e^^™'/^-, 
n = 0, . . . , A'^r ~ 1, for i = —Joi ■ ■ ■ ,Jq- For these i, 
'^I*] ^ i/{NrTs) is in [— i^max, ^^max]- We thus have for all 
ii = — Jo, ■ ■ ■ ,Jo and 12 = J, . . . , A^r — 1 

^ ^ n— 

«0, (44) 

because € [-fmax, J^max] but 12 ^ {0,..., J-1}. That 
is, ipf_^ and are effectively orthogonal for the specified 
ranges of ii and 12- Let us now define the following ordered 
set of (in total A'^^) DFT functions and (truncated) DPSSs; 

jvi — , ipjg , '/^2Jo+l' ■ • ■ ' v^Af^-lJ • 
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Fig. 2. Sparsity enhancement in i?'"' [i] obtained with the proposed combin 

Due to (|44l) and the orthonormality of the ip'f '' [37], all 
functions in Ai' are (effectively) mutually orthonormal with 
the exception of the DPSSs V'l''^ within the index range 
j = 2Jo + l,...,J— 1, which are not orthonormal to the DFT 
functions. Therefore, we derive the final set of basis functions 
A4 = {t/jQ, . . . by Gram-Schmidt orthonormalization 

[65] of M.'. This amounts to setting -0^ = ^f-Jo ^'^^ * ~ 

0, . . . , 2 Jo and V« = E^=0 CnV-l^J,-, + E'n=2Jo + l ^""^^ ' 

i > 2 Jo + 1, with suitable coefficients c„. It follows that 
ii^ti , V'if > ~ for alHi = 0, . . . , J-1 and = J, • ■ • , 
Hence, the Gram-Schmidt orthonormalization algorithm yields 
'^p^ « ip'f^ for all i = J, . . . , iV^ - 1, i.e., the last Nr - J 
basis functions of Ai are effectively known a priori, and the 
algorithm can therefore be terminated after J steps. In fact, 
only Ji — 1 steps are required, because the first 2Jo + 1 = 
J — Ji + 1 (DFT) basis functions are also known. 

With this construction of the the support of 

[i] = X]^=o ^6"'^'^''""^° V'l* N is approximately contained 
in {0, . . . , J— 1} for all v e [— I'max, I'max]- Furthermore, for 
1 = 0,..., J— Ji, the ipi[n] are DFT basis functions, so that 
the sparsity of corresponds to the sparsity given by 

the DFT basis for these indices i. For the Ji — 1 remaining 
indices i ~ J — Ji + l,...,J— 1 within the support interval, 
we cannot expect any sparsity of However, Ji is quite 

small, so that the overall sparsity of i?'^'''' [i] is not deteriorated 
significantly. 

For Nr ^ NL ^ (2048 + 512)16 = 40960 and ly^e^xTs = 
0.2/i^ = 0.2/2048 (corresponding to a maximum Doppler 
frequency of 20% of the subcarrier spacing). Fig. |2] depicts 
i = 0,..., 20 for i^Ts = 0.115/K = 0.115/2048. 
For comparison, 

\ (obtained with a pure DFT basis) and 
l^'^'WI (obtained with a pure DPSS basis) are also shown. We 
see that the proposed DFT-DPSS basis leads to the sparsest 
result: for the pure DPSS basis, there is no sparsity within the 
support interval, while for the pure DFT basis, the sparsity is 
impaired by a strong leakage effect. 

VIII. Simulation Results 

Next, we demonstrate the performance gains that can be 
achieved with our sparsity-enhancing basis expansions and 



DFT-DPSS basis, relative to a pure DFT basis and a pure DPSS basis. 

estimation of ISI/ICI channel coefficients, relative to the basic 
compressive estimator We show results for three different 
recovery algorithms, namely. Lasso (equivalent to BP denois- 
ing), OMP, and CoSaMP 

A. Simulation Setup 

MC system parameters. We simulated CP-OFDM systems 
with K e {512, 1024, 2048} subcarriers and CP length ratio 
{N — K)/K = 1/4. The systems employed 4-QAM symbols 
with Gray labeling, a rate-1/2 convolutional code, and 32x16 
row-column interleaving. The interpolation/anti-aliasing filters 
fi{t) = f2{t) were chosen as root-raised-cosine filters with 
roll-off factor /0=l/4. 

Recovery method. For Lasso, we used the corresponding 
MATLAB function from the toolbox SPGLl [72]. The re- 
quired regularization parameters were found by trial and error. 
CoSaMP requires a prior estimate of the sparsity of x. In 
all simulations of Section IVIII-BI we used the fixed sparsity 
estimate S = 262, which was determined via the formula 
S = [(5/(2 log il/)] suggested in [47], where we set Q = 
l^l = 2048. (Note that in most scenarios where CoSaMP 
was applied, we actually used 2048 pilots.) The number of 
CoSaMP iterations was 15. For OMP, we also used the sparsity 
estimate S = 262 (and, hence, 262 iterations), except for the 
strongly dispersive scenario of Section IVIII-CI Therefore, in 
Section IVIII-BI the vectors produced by OMP and CoSaMP 
were exactly S'-sparse with 5* = 262. 

Channel. We simulated and estimated the channel during 
blocks of L transmitted OFDM symbols (L will be specified 
in the individual subsections). For a more realistic simula- 
tion, the channel contained a diffuse part in addition to a 
sparse (specular) part, with 20 dB less total power than for 
the sparse part. The scattering function of the diffuse part 
was bricked-shaped within a rectangular delay-Doppler region 
{0, . . . , Ji:/4 - 1} X [-i^maxTs, t-maxTs]- The discrctc-delay- 
Doppler spreading function Sh[m^i] of the sparse part was 
computed from (l2Tl i. We always assumed P = 20 propaga- 
tion paths with scatterer delay-Doppler positions (rp/2^, VpT^) 
chosen uniformly at random within (or within a subset of, 
cf. Section EIILB} {0, . . . , i^/4 - 1} x [-v^.^y^T.^v^^J',] 
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Fig. 3. Performance of compressive estimators versus the SNR: (a) MSB, (b) BER 
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for each block of L OFDM symbols. The scatterer ampli- 
tudes rjp were randomly drawn from zero-mean, complex 
Gaussian distributions with three different variances (3 strong 
scatterers of equal mean power, 7 medium scatterers with 
10 dB less mean power, and 10 weak scatterers with 20 dB 
less mean power). Furthermore, we added complex white 
Gaussian noise z[n] whose variance was adjusted to achieve a 
prescribed receive signal-to-noise ratio (SNR) defined as (cf. 

®) j:n=','^Mn]-z[n]\^}/j:t-o'n\4n]n 

Subsampling and pilots. All estimators employed a subsam- 
pled time-frequency grid with A/\ =4 and AL = 1, on which 
the pilots were selected uniformly at random. 

Performance measures. For all simulations, the performance 
is measured by the mean square error (MSE) normalized by 
the mean energy of the channel coefficients, as well as by the 
bit error rate (BER). 

B. Performance Gains Through Basis Expansions 

We first compare the performance of compressive channel 
estimation using the DFT basis (underlying the basic estimator 
of Section HIH i. the optimized basis of Section |VT] (without 
knowledge of channel statistics), and the combined DFT- 
DPSS basis of Section IVIII The number of subcarriers is 
K = 2048, the blocklength is L = 16, and the maximum 
Doppler frequency is i-'max^^ ~ Q.QZ/K (i.e., 3% of the 
subcarrier spacing). Here, the maximum Doppler frequency is 
quite small; accordingly, the estimator of Section IVII-BI only 
performs its initial iteration (where V = {0}). All estimators 
use the same constellation of jT'l = 2048 pilots, corresponding 
to 6.25% of all symbols. Fig. [3] depicts the performance 
versus the SNR for the three recovery algorithms employed. 
The performance of the optimized basis and the combined 



DFT-DPSS basis is seen to be similar and clearly superior 
to that of the pure DFT basis, especially at high SNR. This 
performance gain is due to the better sparsity achieved, and it 
is obtained even though the coherence of the optimized basis 
(/iv = 2.237) is greater than that of the DFT basis (/iu = 1) and 
the measurement matrix for the combined DFT-DPSS basis is 
not constructed from an (ideally) unitary matrix. The larger 
gap to the known-channel BER performance observed in Fig. 
Ob) at high SNR occurs because (i) the number of pilots is 
too small for the channel's sparsity, and (ii) the OMP-based 
and CoSaMP-based estimators produce 5-sparse signals with 
S ~ 262, which is too small for the channel's sparsity. 

The number of pilots, \V\, is an important design parameter 
because it equals the number of measurements available for 
sparse reconstruction. Fig. |4] depicts the performance versus 
\V\ G {512, . . . , 8192} (corresponding to 1.5625% . . .25% of 
all symbols) at an SNR of 17dB. As a reference, the known- 
channel BER is also plotted as a horizontal line. It is seen 
that, as expected, the performance of all estimators improves 
with growing \V\. The optimized basis and the combined DFT- 
DPSS basis are again superior to the DFT basis. 

Next, we demonstrate performance gains that can be 
achieved by the statistically optimized basis expansion of 
Section IVI-BI The system and channel parameters are K = 
512, L = 64, I'maxTs = Q.Qb/K (5% of the subcar- 
rier spacing), and [P] = 2048 (6.25% of all symbols). 
For the sparse channel part, the 20 scatterer delay-Doppler 
positions i/p7^) now are chosen uniformly at ran- 

dom only within {0, . . . , 127} x ([-O.OS/X, -0.0375/ii'] U 
[0.0375//i, 0.05/if]). This serves as a rough approximation 
to the Jakes Doppler spectrum [73], according to which 
the scatterers are stronger when they are closer to the 
maximum Doppler frequency. In order to optimize the ba- 
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Fig. 5. Performance of DFT-based, detemiinistically optimized, and statistically optimized compressive estimators versus the SNR: (a) MSB, (b) BER. 



sis expansion with this prior statistical knowledge, the pdf 
p{ti,vi) (see Section IVI-Bb is set equal to a constant ci > 
within [0, 127TJ x ([-0.05/(A'Ts), -0.0375/(i5:T,)] U 
[0.0375/(A'Ts),0.05/(i4:Ts)]) and equal to zero outside. The 
variance of rji given (ti,i/i) is assumed constant, i.e., 
(T^(ti,i/i) = C2 > 0. Fig. |5] depicts the resulting per- 
formance versus the SNR. For comparison, we also show 
the performance of the deterministically optimized basis ex- 
pansion, which uses only knowledge of t'max, as well as 
the performance of the DFT basis and the known-channel 
BER performance. The statistically optimized basis is seen 



to outperform the other bases. This can be explained by the 
fact that it reduces the leakage effects occurring within the 
Doppler interval [-Qm7b/{K%),Qm7^/{K%)]. 

C. Performance Gains Through ISI/ICI Coefficient Estimation 

Finally, we assess the performance of the compressive, iter- 
ative, decision-directed estimator of Section [VTIl which is able 
to estimate also off-diagonal (ISI/ICI) channel coefficients. 
We consider a wide range of maximum Doppler frequencies, 
corresponding also to strongly frequency-dispersive channels; 
more specifically, z^maxTs e [0.03/A^ 0.25/ K] or 3% . . . 25% 
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of the subcarrier spacing. The system parameters are K = 
1024, L = 4, SNR=17dB, and \V\ = 128 (i.e., only 3.125% 
of all symbols). There occurs no ISI, only ICI. The estimator 
uses V = {(0, —3), . . . , (0, 3)} for all iterations r > 1, so that 
the ICI equalizer processes the main diagonal plus the first 
three upper and lower off-diagonals. The reliability threshold 
is e = 0.2. For ICI equalization, we use the LSQR equalizer 
proposed in [70], with a fixed number of 15 iterations. Fur- 
thermore, we use OMP with 90 iterations for CS recovery, and 
the combined DFT-DPSS basis of Section [ylTCl 

Fig. |6] depicts the performance of the estimator versus the 
maximum Doppler frequency for iterations up to r = R, 
with R S {0, . . . , 9}. For comparison, the known-channel 
BER performance of conventional one-tap equalization and 
of LSQR-based ICI equalization is also shown. The MSE 
takes into account the estimated diagonal and first three upper 
and lower off-diagonal channel coefficients; it is normalized 
accordingly. For i? = 0, where only the diagonal channel 
coefficients are estimated, the off-diagonal coefficients of the 
estimated channel are set to zero when calculating the MSE. 
It is seen from Fig. |6] that for i? = 0, the performance is 
very poor even for small t^max (weakly dispersive channels). 
This is due to the small number of pilots used. However, 
the performance is improved with an increasing number R 
of iterations, thus demonstrating the benefits of off-diagonal 
coefficient estimation and the use of virtual pilots. The initial 
improvement is slower for larger i^max, again because of the 
small number of pilots. It is furthermore seen that for R = 9 
iterations, for large i^max, the proposed compressive estimator 
is superior to the known-channel performance of one-tap 
equalization. Our results also show that the proposed decision- 
directed method is advantageous not only for coping with 



strongly dispersive channels; it is equally useful for further 
improving the spectral efficiency, even for mildly dispersive 
channels, because of the smaller number of pilots required. 

IX. Conclusion 

We considered the application of compressed sensing tech- 
niques to the estimation of doubly selective multipath chan- 
nels within pulse-shaping multicarrier systems (which include 
OFDM systems as a special case). The channel coefficients on 
a subsampled time-frequency grid are estimated in a way that 
exploits the channel's sparsity in a dual delay-Doppler domain. 
We demonstrated that this delay-Doppler sparsity is limited 
by leakage effects. For combating leakage effects and, thus, 
enhancing sparsity, we proposed the use of an explicit basis 
expansion that replaces the Fourier transform used in the basic 
compressive channel estimation method. We also developed 
an iterative basis design algorithm, and we extended our basis 
design to the case where prior statistical information about the 
channel is available. 

For strongly time-frequency dispersive channels, we then 
presented an alternative compressive channel estimator that 
is capable of estimating the "off-diagonal" channel coeffi- 
cients characterizing intersymbol and intercarrier interference 
(ISI/ICI). Sparsity of the channel representation was here 
achieved by a basis expansion combining the advantages of 
Fourier (exponential) and prolate spheroidal sequences. 

Simulation results demonstrated considerable performance 
gains achieved by the proposed sparsity-enhancing basis ex- 
pansions and by explicit estimation of ISI/ICI channel coef- 
ficients. The additional computational complexity required by 
the basis expansions is moderate; in particular, the bases can 
be precomputed before the start of data transmission. 
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